Abstract. Eigenvalue estimates that are optimal in some sense have self- 
evident appeal and leave estimators with a sense of virtue and economy. 
So, it is natural that ongoing searches for effective strategies for difficult 
tasks such as estimating matrix eigenvalues that are situated well into the 
interior of the spectrum revisit from time to time methods that are known 
to yield optimal bounds. This article reviews a variety of results related to 
obtaining optimal bounds to matrix eigenvalues — some results are well- 
known; others are less known; and a few are new. We focus especially 
on Ritz and harmonic Ritz values, and right- and left-definite variants of 
Lehmann's method. 



1. Ritz and Related Values 

Let K and M be n x n real symmetric positive definite matrices and consider 
the eigenvalue problem 

(1.1)    $K x = \lambda M x.$

Label the eigenvalues from the edges toward the center (following [16]) as

    $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \le \lambda_{-3} \le \lambda_{-2} \le \lambda_{-1},$

with labeling inherited by the associated eigenvectors: $x_1, x_2, \ldots, x_{-2}, x_{-1}$.



Solutions to (1.1) are evidently eigenvalue/eigenvector pairs of the matrix $M^{-1}K$, which is non-symmetric on the face of it. However, $M^{-1}K$ is self-adjoint with respect to both the M-inner product, $x^*My$, and the K-inner product, $x^*Ky$. Denote by $x^{\dagger}$ the M-adjoint of a vector $x$, $x^{\dagger} = x^*M$, and by $x^{\ddagger}$ the K-adjoint, $x^{\ddagger} = x^*K$. "Self-adjointness" of $M^{-1}K$ amounts to the assertion that for all $x$ and $y$, $x^{\dagger}(M^{-1}Ky) = (M^{-1}Kx)^{\dagger}y$ and $x^{\ddagger}(M^{-1}Ky) = (M^{-1}Kx)^{\ddagger}y$. Self-adjointness with respect to the M- and K-inner products implies that the matrix representation of $M^{-1}K$ with respect to any M-orthonormal or K-orthonormal basis will be symmetric.
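The self-adjointness claims can be checked numerically. The following is a minimal NumPy sketch on random test data; the helper `random_spd` and all variable names are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

def random_spd(size):
    """Random symmetric positive definite matrix (illustrative test data)."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

K, M = random_spd(n), random_spd(n)
T = np.linalg.solve(M, K)              # T = M^{-1} K, non-symmetric in general
x, y = rng.standard_normal(n), rng.standard_normal(n)

# M-inner product: x^* M (T y) versus (T x)^* M y
m_gap = x @ M @ (T @ y) - (T @ x) @ M @ y
# K-inner product: x^* K (T y) versus (T x)^* K y
k_gap = x @ K @ (T @ y) - (T @ x) @ K @ y

# Representation of T in an M-orthonormal basis Q (Q^T M Q = I).
Lc = np.linalg.cholesky(M)             # M = Lc Lc^T
Q = np.linalg.inv(Lc).T                # columns of Q are M-orthonormal
T_rep = np.linalg.solve(Q, T @ Q)      # Q^{-1} T Q
sym_gap = np.abs(T_rep - T_rep.T).max()

print(m_gap, k_gap, sym_gap)           # all three should be at rounding level
```

Both gaps vanish up to rounding because each side reduces to $x^*Ky$ (M-inner product) or $x^*KM^{-1}Ky$ (K-inner product).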

For a given subspace V of dimension m < n, the Rayleigh-Ritz method proceeds by selecting a basis for V, say constituting the columns of a matrix $P \in \mathbb{R}^{n \times m}$, and then considering the (smaller) eigenvalue problem

(1.2)    $P^*KP\,y = \Lambda\, P^*MP\,y.$

This will yield m eigenvalues (called Ritz values) labeled similarly to $\{\lambda_i\}$ as

    $\Lambda_1 \le \Lambda_2 \le \Lambda_3 \le \cdots \le \Lambda_{-3} \le \Lambda_{-2} \le \Lambda_{-1}$

with corresponding eigenvectors $y_1, y_2, \ldots, y_{-2}, y_{-1}$. Vectors in V given as $u_k = P y_k$ are Ritz vectors associated with the Ritz values $\Lambda_k$. Since $\mathbb{R}^m = \mathrm{span}(y_1, y_2, \ldots, y_{-2}, y_{-1})$, the full set of Ritz vectors evidently forms a basis for V, which is both K-orthogonal and M-orthogonal and may be presumed to be M-normalized without loss of generality: $u_i^{\ddagger}u_j = 0$ and $u_i^{\dagger}u_j = 0$ for $i \ne j$, and $u_i^{\dagger}u_i = 1$.
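A minimal sketch of the Rayleigh-Ritz step (1.2), assuming dense NumPy arrays and random test data. The function name `ritz_pairs` and the Cholesky-based reduction to a standard symmetric eigenproblem are choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3

def random_spd(size):
    """Random symmetric positive definite matrix (illustrative test data)."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

def ritz_pairs(K, M, P):
    """Solve P^T K P y = Lambda P^T M P y; return Ritz values and M-normalized Ritz vectors."""
    A, B = P.T @ K @ P, P.T @ M @ P
    Lc = np.linalg.cholesky(B)                              # B = Lc Lc^T
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T     # Lc^{-1} A Lc^{-T}
    vals, Z = np.linalg.eigh((C + C.T) / 2)
    Y = np.linalg.solve(Lc.T, Z)                            # eigenvectors y_k of the pencil
    return vals, P @ Y                                      # Ritz vectors u_k = P y_k

K, M = random_spd(n), random_spd(n)
P = rng.standard_normal((n, m))                             # any basis of the subspace V
ritz_values, U = ritz_pairs(K, M, P)

# Ritz vectors are M-orthonormal and K-orthogonal.
print(np.round(U.T @ M @ U, 10))
print(np.round(U.T @ K @ U, 10))
```

The second printed matrix is diagonal with the Ritz values on its diagonal, which is the K-orthogonality stated above.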

Harmonic Ritz values [^] result from applying the Rayleigh-Ritz method 
to the eigenvalue problem 

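(1.3)    $K M^{-1} K x = \lambda K x,$

which is equivalent to (1.1): it has the same eigenvalues and eigenvectors. If we use the same subspace V, the harmonic Ritz values are then the eigenvalues of the m × m problem

(1.4)    $P^*KM^{-1}KP\,y = \widetilde{\Lambda}\, P^*KP\,y,$

yielding

    $\widetilde{\Lambda}_1 \le \widetilde{\Lambda}_2 \le \widetilde{\Lambda}_3 \le \cdots \le \widetilde{\Lambda}_{-3} \le \widetilde{\Lambda}_{-2} \le \widetilde{\Lambda}_{-1}.$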
Just as Ritz values are weighted means of the eigenvalues of the matrix, har- 
monic Ritz values are harmonic means of the eigenvalues of the matrix. 

Quantities which will be introduced here (for lack of a better name) as dual 
harmonic Ritz values result from applying the Rayleigh-Ritz method to the 
eigenvalue problem 

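(1.5)    $M x = \lambda M K^{-1} M x,$

which is also equivalent to (1.1), in the sense of having the same eigenvalues and eigenvectors. If we use the same approximating subspace V, the dual harmonic Ritz values are the eigenvalues of the m × m problem

(1.6)    $P^*MP\,y = \widehat{\Lambda}\, P^*MK^{-1}MP\,y,$

yielding

    $\widehat{\Lambda}_1 \le \widehat{\Lambda}_2 \le \widehat{\Lambda}_3 \le \cdots \le \widehat{\Lambda}_{-3} \le \widehat{\Lambda}_{-2} \le \widehat{\Lambda}_{-1}.$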

Dual harmonic Ritz values are also harmonic means of the matrix eigenvalues, 
however with a different weighting than for harmonic Ritz values. 

Both harmonic Ritz and dual harmonic Ritz values were known even 50 years ago and found to be useful in differential eigenvalue problems. Collatz referred to the harmonic Ritz problem (1.4) as Grammel's equations (citing Grammel's earlier work [^]) and viewed the Rayleigh quotients for the Ritz problem (1.2), the harmonic Ritz problem (1.4), and the dual harmonic Ritz problem (1.6), all as elements of an infinite monotone sequence of "Schwarz quotients" that could be generated iteratively.

As long as K and M are positive definite, all three of Ritz, harmonic Ritz, and dual harmonic Ritz values provide "inner" bounds to the "outer" eigenvalues of the pencil $K - \lambda M$ (that is, of the problem (1.1)). In comparing the three types of approximations using the same subspace V, harmonic Ritz values provide the best bounds of the three to the upper eigenvalues of (1.1); dual harmonic Ritz values provide the best bounds of the three to the lower eigenvalues. As an example, Figure 1 shows bounds obtained for a sequence of nested Krylov subspaces taken for V, with K = diag([1 : 2 : 100]), M = I, and a starting vector of all ones (the example of [^]).
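The comparison of the three families can be reproduced for the example just described. This is a minimal NumPy sketch: the Krylov dimension m = 6, the helper names `gen_eigvals` and `krylov_basis`, and the reorthogonalized Gram-Schmidt construction are assumptions made for illustration.

```python
import numpy as np

def gen_eigvals(A, B):
    """Ascending eigenvalues of A y = t B y (A symmetric, B symmetric positive definite)."""
    Lc = np.linalg.cholesky(B)
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T    # Lc^{-1} A Lc^{-T}
    return np.linalg.eigvalsh((C + C.T) / 2)

def krylov_basis(K, q, m):
    """Orthonormal basis of span{q, Kq, ..., K^(m-1) q} (Gram-Schmidt, two passes)."""
    Q = np.zeros((len(q), m))
    v = q / np.linalg.norm(q)
    for j in range(m):
        Q[:, j] = v
        w = K @ v
        for _ in range(2):
            w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        v = w / np.linalg.norm(w)
    return Q

lam = np.arange(1.0, 100.0, 2.0)          # eigenvalues 1, 3, ..., 99 of K (M = I)
K, Kinv = np.diag(lam), np.diag(1.0 / lam)
m = 6
P = krylov_basis(K, np.ones(len(lam)), m)

ritz = gen_eigvals(P.T @ K @ P, P.T @ P)                   # problem (1.2)
harmonic = gen_eigvals(P.T @ K @ K @ P, P.T @ K @ P)       # problem (1.4)
dual = gen_eigvals(P.T @ P, P.T @ Kinv @ P)                # problem (1.6)

for name, vals in [("dual harmonic", dual), ("Ritz", ritz), ("harmonic", harmonic)]:
    print(f"{name:>14}: {np.round(vals, 3)}")
```

For each index the printed values should increase from the dual harmonic row to the Ritz row to the harmonic row, while all of them stay between the corresponding lowest and highest eigenvalues of K.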
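Theorem 1.1. Suppose K and M are positive definite. Then

    $\lambda_k \le \widehat{\Lambda}_k \le \Lambda_k \le \widetilde{\Lambda}_k$   for $k = 1, 2, \ldots$
    $\widehat{\Lambda}_{-\ell} \le \Lambda_{-\ell} \le \widetilde{\Lambda}_{-\ell} \le \lambda_{-\ell}$   for $\ell = 1, 2, \ldots$

Proof: The min-max characterization yields

    $\lambda_k = \min_{\dim\mathcal{S}=k}\ \max_{x\in\mathcal{S}} \frac{x^*Kx}{x^*Mx} \ \le\ \min_{\dim\mathcal{S}=k,\ \mathcal{S}\subset\mathcal{V}}\ \max_{x\in\mathcal{S}} \frac{x^*Kx}{x^*Mx} \ =\ \min_{\dim\mathcal{R}=k}\ \max_{y\in\mathcal{R}} \frac{y^*P^*KPy}{y^*P^*MPy} = \Lambda_k,$

and likewise,

    $\lambda_k = \min_{\dim\mathcal{S}=k}\ \max_{x\in\mathcal{S}} \frac{x^*KM^{-1}Kx}{x^*Kx} \ \le\ \min_{\dim\mathcal{S}=k,\ \mathcal{S}\subset\mathcal{V}}\ \max_{x\in\mathcal{S}} \frac{x^*KM^{-1}Kx}{x^*Kx} \ =\ \min_{\dim\mathcal{R}=k}\ \max_{y\in\mathcal{R}} \frac{y^*P^*KM^{-1}KPy}{y^*P^*KPy} = \widetilde{\Lambda}_k.$

A similar argument shows $\lambda_k \le \widehat{\Lambda}_k$. By repeating the argument for the eigenvalue problem $-Kx = (-\lambda)Mx$, one finds $\lambda_\ell(-K, M) \le -\Lambda_{-\ell}$ (where $\lambda(A, B)$ is used to denote an eigenvalue of the pencil $A - \lambda B$). Notice that $-\lambda_\ell(-K, M) = \lambda_{-\ell}(K, M)$. Thus, $\Lambda_{-\ell} \le \lambda_{-\ell}$ and, by the same reasoning, $\widetilde{\Lambda}_{-\ell} \le \lambda_{-\ell}$.

For any $x \in \mathbb{R}^n$, the Cauchy-Schwarz inequality implies

    $(x^*Kx)^2 = (x^*KM^{-1/2}M^{1/2}x)^2 \le x^*KM^{-1}Kx \cdot x^*Mx$
    and $(x^*Mx)^2 = (x^*MK^{-1/2}K^{1/2}x)^2 \le x^*MK^{-1}Mx \cdot x^*Kx.$

Thus,

    $\frac{x^*Mx}{x^*MK^{-1}Mx} \le \frac{x^*Kx}{x^*Mx} \le \frac{x^*KM^{-1}Kx}{x^*Kx},$

which then implies for each $k = 1, 2, \ldots, m$

    $\lambda_k \le \widehat{\Lambda}_k = \min_{\dim\mathcal{S}=k,\ \mathcal{S}\subset\mathcal{V}}\ \max_{x\in\mathcal{S}} \frac{x^*Mx}{x^*MK^{-1}Mx}$
    $\le \min_{\dim\mathcal{S}=k,\ \mathcal{S}\subset\mathcal{V}}\ \max_{x\in\mathcal{S}} \frac{x^*Kx}{x^*Mx} = \Lambda_k$
    $\le \min_{\dim\mathcal{S}=k,\ \mathcal{S}\subset\mathcal{V}}\ \max_{x\in\mathcal{S}} \frac{x^*KM^{-1}Kx}{x^*Kx} = \widetilde{\Lambda}_k.$  ■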

The situation is somewhat different if K is indefinite. The Ritz estimates are still "inner" bounds, that is $\lambda_k \le \Lambda_k$ and $\Lambda_{-\ell} \le \lambda_{-\ell}$. However, both harmonic Ritz and dual harmonic Ritz values now provide "outer" bounds (lower bounds) to negative eigenvalues of (1.1) and no simple relationship is known that would predict which of the three bounds is best (essentially owing to there being no simple analog of the Cauchy-Schwarz inequality for indefinite inner products).

Despite the differences in behavior described above, Ritz, harmonic Ritz, and dual harmonic Ritz values each provide optimal bounds, obviously each with respect to a slightly different notion of optimality. For the Ritz problem, the matrices $P^*KP$ and $P^*MP$ provide a "sampling" of the full matrices K and M on the subspace V. Whatever spectral information about the original eigenvalue problem (1.1) we are able to deduce by examining the Rayleigh-Ritz problem (1.2), we must draw the same conclusions for all matrix pencils that are "aliased" by the Rayleigh-Ritz sampling. Define the following set of such n × n matrix pairs:

    $\mathcal{C}(V) = \{\, (A, B) \ :\ A \text{ and } B \text{ are positive definite},\ \ P^*(A - K)P = 0,\ \ P^*(B - M)P = 0 \,\}.$

Theorem 1.2. For any choice of positive integers $\nu$, $\pi$ with $\nu + \pi = m$ and any choice of matrix pairs $(A, B) \in \mathcal{C}(V)$

    $\lambda_k(A, B) \le \Lambda_k$   for $k = 1, 2, \ldots, \nu$
    $\Lambda_{-\ell} \le \lambda_{-\ell}(A, B)$   for $\ell = 1, 2, \ldots, \pi.$

Furthermore, for each index pair $\nu$, $\pi$, there exists a matrix pair $(A, B) \in \mathcal{C}(V)$ such that

    $\lambda_k(A, B) = \Lambda_k$   for $k = 1, 2, \ldots, \nu$
    $\Lambda_{-\ell} = \lambda_{-\ell}(A, B)$   for $\ell = 1, 2, \ldots, \pi.$

So, no better bounds are possible with only the information available to the Rayleigh-Ritz method as described by (1.2).



Proof: The first assertion is a restatement of Theorem 1.1 for the matrix pencil $A - \lambda B$. To show optimality, define the matrix of Ritz vectors: $U = [u_1, u_2, \ldots, u_\nu, u_{-\pi}, \ldots, u_{-2}, u_{-1}]$. Notice that U is an M-orthonormal basis for V: $U^*MU = I$. Define also the diagonal matrix of Ritz values

    $D = \mathrm{diag}(\Lambda_1, \ldots, \Lambda_\nu, \Lambda_{-\pi}, \ldots, \Lambda_{-1})$

and fix $\Lambda = \tfrac{1}{2}(\Lambda_\nu + \Lambda_{-\pi})$. Now, consider

    $A = MUDU^*M + \Lambda(M - MUU^*M)$ and $B = M.$

One may verify that all required conditions are satisfied, in particular

    $(A - \lambda B)U = MU(D - \lambda I)$

and for any $v \in \mathbb{R}^n$ with $v^*MU = 0$,

    $(A - \lambda B)v = Mv(\Lambda - \lambda).$  ■

A similar construction can be used to show the (analogously defined) optimal- 
ity of harmonic Ritz values and dual harmonic Ritz values. 
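The construction used in the proof of Theorem 1.2 can be exercised directly. The sketch below uses random test matrices and NumPy only; the midpoint choice for the fill-in value, the helper names, and the sizes n = 9, m = 4, nu = 2 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, nu = 9, 4, 2
npi = m - nu                                   # number of upper bounds (pi in the text)

def random_spd(size):
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

def gen_eigh(A, B):
    """Eigenvalues (ascending) and B-orthonormal eigenvectors of A y = t B y."""
    Lc = np.linalg.cholesky(B)
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T
    vals, Z = np.linalg.eigh((C + C.T) / 2)
    return vals, np.linalg.solve(Lc.T, Z)

K, M = random_spd(n), random_spd(n)
P = rng.standard_normal((n, m))
ritz, Y = gen_eigh(P.T @ K @ P, P.T @ M @ P)
U = P @ Y                                      # M-orthonormal Ritz vectors
Lam = 0.5 * (ritz[nu - 1] + ritz[nu])          # value between Lambda_nu and Lambda_{-pi}

MU = M @ U
A = MU @ np.diag(ritz) @ MU.T + Lam * (M - MU @ MU.T)
B = M
eigs_AB, _ = gen_eigh(A, B)

print("Ritz values      :", np.round(ritz, 6))
print("lowest  nu of (A,B):", np.round(eigs_AB[:nu], 6))
print("highest pi of (A,B):", np.round(eigs_AB[-npi:], 6))
```

The pair (A, B) samples to the same small matrices as (K, M) on V, yet its outer eigenvalues coincide with the Ritz values, which is the sense in which the Ritz bounds cannot be improved.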

As we will see in following sections, Ritz values, harmonic Ritz values, 
and dual harmonic Ritz values are limiting cases of parameterized families of 
bounds arising from "left-definite" and "right-definite" Lehmann intervals. 



[Figure: schematic relating Dual Harmonic Ritz Values, Left-definite Lehmann Bounds, Right-definite Lehmann Bounds, and Harmonic Ritz Values, with the label ρ → ±∞.]

2. Lehmann's Optimal Intervals 

Each of the Ritz-related methods discussed above will have certain advantages in estimating the extreme eigenvalues of (1.1). None are particularly effective in estimating interior eigenvalues, however. Usual strategies for obtaining accurate estimates to the eigenvalues of (1.1) lying close to a given value $\rho$ involve a spectral mapping that turns the spectrum "inside out" around $\rho$, mapping interior eigenvalues in the neighborhood of $\rho$ to extreme eigenvalues that are more accessible. "Shift and invert" strategies typically use the spectral mapping $\lambda \mapsto 1/(\lambda - \rho)$. A variant used especially for buckling problems (where M may be singular) utilizes instead the spectral mapping $\lambda \mapsto \lambda/(\lambda - \rho)$. As we shall see, both of these spectral mappings play a fundamental role in the optimal bounds discovered by Lehmann ([11], [12], [13]). The derivation used here is in the spirit of that given by Maehly in [14] and the associated methods are sometimes called Lehmann-Maehly methods.

Fix a scalar $\rho$ that is not an eigenvalue of (1.1) and define the index r to satisfy

(2.1)    $\lambda_{r-1} < \rho < \lambda_r.$

The right-definite Lehmann method follows first from considering the spectral mapping $\lambda \mapsto 1/(\lambda - \rho)$ and an associated eigenvalue problem equivalent to (1.1):

(2.2)    $M(K - \rho M)^{-1}Mx = \frac{1}{\lambda - \rho}\,Mx,$

which has eigenvalues distributed as

    $\frac{1}{\lambda_{r-1}-\rho} \le \frac{1}{\lambda_{r-2}-\rho} \le \cdots < 0 < \cdots \le \frac{1}{\lambda_{r+1}-\rho} \le \frac{1}{\lambda_r-\rho}.$

Notice that eigenvalues of (1.1) flanking $\rho$ are mapped to extremal eigenvalues of (2.2). Now use an m-dimensional subspace $\mathcal{S} = \mathrm{Ran}(S)$ to generate Rayleigh-Ritz estimates for the eigenvalues of (2.2):

(2.3)    $[S^*M(K - \rho M)^{-1}MS]\,y = R\,[S^*MS]\,y,$

where $S \in \mathbb{R}^{n\times m}$. Suppose (2.3) has $\nu$ negative eigenvalues $R_1 \le \cdots \le R_\nu < 0$ and $\pi = m - \nu$ positive eigenvalues $0 < R_{-\pi} \le \cdots \le R_{-1}$. Regardless of the subspace $\mathcal{S}$ that is chosen, the min-max principle (or Theorem 1.1) guarantees that for each $k = 1, 2, \ldots, \nu$ and $\ell = 1, 2, \ldots, \pi$

    $\frac{1}{\lambda_{r-k}-\rho} \le R_k$   and   $R_{-\ell} \le \frac{1}{\lambda_{r+\ell-1}-\rho}.$

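Rearrange and introduce

(2.4)    $\Lambda^{(R)}_{-k} = \rho + \frac{1}{R_k} \le \lambda_{r-k}$   and   $\lambda_{r+\ell-1} \le \rho + \frac{1}{R_{-\ell}} = \Lambda^{(R)}_{\ell}$

for $k = 1, 2, \ldots, \nu$ and $\ell = 1, 2, \ldots, \pi$. Notice that labeling of $\Lambda^{(R)}$ is arranged relative to $\rho$:

    $\cdots \le \Lambda^{(R)}_{-3} \le \Lambda^{(R)}_{-2} \le \Lambda^{(R)}_{-1} < \rho < \Lambda^{(R)}_{1} \le \Lambda^{(R)}_{2} \le \Lambda^{(R)}_{3} \le \cdots$

An equivalent statement combining (2.1) and (2.4) is

    Each of the intervals $[\Lambda^{(R)}_{-k}, \rho)$ and $(\rho, \Lambda^{(R)}_{\ell}]$ contains respectively at least k and $\ell$ eigenvalues of (1.1) for $k = 1, 2, \ldots, \nu$ and $\ell = 1, 2, \ldots, \pi$.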



To avoid the need in (2.3) for solving linear systems having the indefinite coefficient matrix $(K - \rho M)$, change variables in (2.3) as $P = (K - \rho M)^{-1}MS$, which then implicitly determines $\mathcal{S}$ via a choice of V. (2.3) can then be rewritten as

(2.5)    $[P^*(K - \rho M)P]\,y = R\,[P^*(K - \rho M)M^{-1}(K - \rho M)P]\,y.$

When dim V = 1 (so that P is a single vector p) and $p^*(K - \rho M)p < 0$, (2.4) becomes Temple's inequality

    $\rho + \frac{p^*(K - \rho M)M^{-1}(K - \rho M)p}{p^*(K - \rho M)p} = \frac{p^*(KM^{-1}K - \rho K)p}{p^*(K - \rho M)p} \le \lambda_{r-1}.$
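The equality of the two quotients in Temple's inequality follows from expanding the product in the numerator. A short worked derivation, using only the quantities already defined in (2.5):

```latex
\begin{align*}
\rho + \frac{p^*(K-\rho M)M^{-1}(K-\rho M)p}{p^*(K-\rho M)p}
  &= \frac{\rho\,p^*(K-\rho M)p + p^*\bigl(KM^{-1}K - 2\rho K + \rho^2 M\bigr)p}{p^*(K-\rho M)p} \\
  &= \frac{p^*\bigl(KM^{-1}K - \rho K\bigr)p}{p^*(K-\rho M)p},
\end{align*}
% since  \rho K - \rho^2 M - 2\rho K + \rho^2 M = -\rho K .
```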

Some additional notation will reduce the impending clutter of symbols. Introduce matrices of Schwarz constants:

    $H_0 = [P^*KM^{-1}KP]$,  $H_1 = [P^*KP]$,  and  $H_2 = [P^*MP]$.

Then expanding out the various terms, (2.5) becomes

(2.6)    $[H_1 - \rho H_2]\,y = R\,[H_0 - 2\rho H_1 + \rho^2 H_2]\,y$

which may be rearranged to obtain

(2.7)    $[H_0 - \rho H_1]\,y = \Lambda^{(R)}\,[H_1 - \rho H_2]\,y.$

Notice that (2.7) could be written in terms of the M-inner product as

(2.8)    $P^{\dagger}[(M^{-1}K)^2 - \rho(M^{-1}K)]P\,y = \Lambda^{(R)}\,P^{\dagger}[(M^{-1}K) - \rho I]P\,y$

or in terms of the K-inner product as

(2.9)    $P^{\ddagger}[(M^{-1}K) - \rho I]P\,y = \Lambda^{(R)}\,P^{\ddagger}[I - \rho(M^{-1}K)^{-1}]P\,y.$
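The right-definite bounds can be computed from the definite pencil (2.6) and then mapped back through $\Lambda^{(R)} = \rho + 1/R$. The following NumPy sketch does this for a hypothetical test problem (diagonal K, M = I, a trial subspace near a few eigenvectors); the problem data and helper names are assumptions chosen for illustration.

```python
import numpy as np

def gen_eigvals(A, B):
    """Ascending eigenvalues of A y = t B y (A symmetric, B symmetric positive definite)."""
    Lc = np.linalg.cholesky(B)
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T
    return np.linalg.eigvalsh((C + C.T) / 2)

rng = np.random.default_rng(3)
lam = np.arange(1.0, 100.0, 2.0)          # eigenvalues of K = diag(1, 3, ..., 99), M = I
n, m, rho = len(lam), 6, 12.0             # rho = 12 is not an eigenvalue
K = np.diag(lam)

# Hypothetical trial subspace: eigenvectors for 7, 9, ..., 17 plus a small perturbation.
P = np.eye(n)[:, 3:3 + m] + 0.005 * rng.standard_normal((n, m))

H0, H1, H2 = P.T @ K @ K @ P, P.T @ K @ P, P.T @ P                 # Schwarz constants
R = gen_eigvals(H1 - rho * H2, H0 - 2 * rho * H1 + rho**2 * H2)    # pencil (2.6)

R_neg = R[R < 0]                          # R_1 <= ... <= R_nu < 0
R_pos = R[R > 0][::-1]                    # R_{-1} >= R_{-2} >= ... > 0
lower = rho + 1.0 / R_neg                 # Lambda^(R)_{-k}, k = 1..nu (closest to rho first)
upper = rho + 1.0 / R_pos                 # Lambda^(R)_{l},  l = 1..pi (closest to rho first)

below = lam[lam < rho][::-1]              # lambda_{r-1}, lambda_{r-2}, ...
above = lam[lam > rho]                    # lambda_r, lambda_{r+1}, ...
print("lower bounds:", np.round(lower, 4), "for", below[:len(lower)])
print("upper bounds:", np.round(upper, 4), "for", above[:len(upper)])
```

Working with (2.6) rather than (2.7) keeps the right-hand matrix positive definite, so a symmetric eigensolver applies directly.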



The left-definite Lehmann method can be obtained by considering the spectral mapping $\lambda \mapsto \lambda/(\lambda - \rho)$ and an associated eigenvalue problem, also equivalent to (1.1):

(2.10)    $K(K - \rho M)^{-1}Kx = \frac{\lambda}{\lambda - \rho}\,Kx$

which has eigenvalues distributed as

(2.11)    $\frac{\lambda_{r-1}}{\lambda_{r-1}-\rho} \le \frac{\lambda_{r-2}}{\lambda_{r-2}-\rho} \le \cdots < 0$   and   $1 < \cdots \le \frac{\lambda_{r+1}}{\lambda_{r+1}-\rho} \le \frac{\lambda_r}{\lambda_r-\rho}$

(as long as both K and M are positive definite, no eigenvalue gets mapped into the interval [0, 1]). Again the eigenvalues of (1.1) flanking $\rho$ are mapped to extremal eigenvalues of (2.10). Using an m-dimensional subspace $\mathcal{T} = \mathrm{Ran}(T)$, one may generate Rayleigh-Ritz estimates for the eigenvalues of (2.10):

(2.12)    $[T^*K(K - \rho M)^{-1}KT]\,y = L\,[T^*KT]\,y,$

where $T \in \mathbb{R}^{n\times m}$.



If (2.12) has $\nu$ negative eigenvalues $L_1 \le L_2 \le \cdots \le L_\nu < 0$ and $\pi = m - \nu$ positive eigenvalues $1 < L_{-\pi} \le \cdots \le L_{-2} \le L_{-1}$, then regardless of the subspace $\mathcal{T}$ that is chosen, the min-max principle (or again, Theorem 1.1) guarantees that

(2.13)    $\frac{\lambda_{r-k}}{\lambda_{r-k}-\rho} \le L_k$   and   $L_{-\ell} \le \frac{\lambda_{r+\ell-1}}{\lambda_{r+\ell-1}-\rho}$

or equivalently

(2.14)    $\Lambda^{(L)}_{-k} = \rho - \frac{\rho}{1 - L_k} \le \lambda_{r-k}$   and   $\lambda_{r+\ell-1} \le \rho - \frac{\rho}{1 - L_{-\ell}} = \Lambda^{(L)}_{\ell}$

for $k = 1, 2, \ldots, \nu$ and $\ell = 1, 2, \ldots, \pi$. Just as for $\Lambda^{(R)}$, the labeling of $\Lambda^{(L)}$ is done relative to $\rho$:

    $\cdots \le \Lambda^{(L)}_{-2} \le \Lambda^{(L)}_{-1} < \rho < \Lambda^{(L)}_{1} \le \Lambda^{(L)}_{2} \le \Lambda^{(L)}_{3} \le \cdots$

An equivalent statement combining (2.1) and (2.14) is

    Each of the intervals $[\Lambda^{(L)}_{-k}, \rho)$ and $(\rho, \Lambda^{(L)}_{\ell}]$ contains respectively at least k and $\ell$ eigenvalues of (1.1) for $k = 1, 2, \ldots, \nu$ and $\ell = 1, 2, \ldots, \pi$.

As before, in order to avoid solving systems with the coefficient matrix $(K - \rho M)$, change variables in (2.12) as $P = (K - \rho M)^{-1}KT$, which then implicitly determines T via a choice of P. (2.12) can then be rewritten as

(2.15)    $[P^*(K - \rho M)P]\,y = L\,[P^*(K - \rho M)K^{-1}(K - \rho M)P]\,y.$

Introduce

    $H_3 = [P^*MK^{-1}MP].$

Then (2.15) becomes

(2.16)    $[H_1 - \rho H_2]\,y = L\,[H_1 - 2\rho H_2 + \rho^2 H_3]\,y,$

which may be rearranged to get

(2.17)    $[H_1 - \rho H_2]\,y = \Lambda^{(L)}\,[H_2 - \rho H_3]\,y.$



Observe that both (2.6) and (2.16) are Hermitian definite pencils with the same left-hand side. By the Sylvester Law of Inertia, they each have the same number of negative (and hence positive) eigenvalues. If a shift of $\rho = 0$ is chosen in (2.7), the harmonic Ritz problem (1.4) is obtained and $\widetilde{\Lambda}_i = \Lambda^{(R)}_i|_{\rho=0}$. As $\rho \to \pm\infty$, (2.7) reduces to the Ritz problem (1.2). Similarly, if a shift of $\rho = 0$ is chosen in (2.17), the Ritz problem (1.2) is obtained and $\Lambda_i = \Lambda^{(L)}_i|_{\rho=0}$. As $\rho \to \pm\infty$, (2.17) reduces to the dual harmonic Ritz problem (1.6).
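The left-definite bounds can be computed in the same way from the definite pencil (2.16), with $\Lambda^{(L)} = \rho - \rho/(1 - L)$. The sketch below reuses a hypothetical diagonal test problem with M = I and also confirms that the values obtained this way agree with the eigenvalues of the rearranged pencil (2.17); the data and helper names are illustrative assumptions.

```python
import numpy as np

def gen_eigvals(A, B):
    """Ascending eigenvalues of A y = t B y (A symmetric, B symmetric positive definite)."""
    Lc = np.linalg.cholesky(B)
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T
    return np.linalg.eigvalsh((C + C.T) / 2)

rng = np.random.default_rng(3)
lam = np.arange(1.0, 100.0, 2.0)          # eigenvalues of K = diag(1, 3, ..., 99), M = I
n, m, rho = len(lam), 6, 12.0
K, Kinv = np.diag(lam), np.diag(1.0 / lam)
P = np.eye(n)[:, 3:3 + m] + 0.005 * rng.standard_normal((n, m))   # hypothetical subspace

H1, H2, H3 = P.T @ K @ P, P.T @ P, P.T @ Kinv @ P
L = gen_eigvals(H1 - rho * H2, H1 - 2 * rho * H2 + rho**2 * H3)   # pencil (2.16)

L_neg = L[L < 0]                          # L_1 <= ... <= L_nu < 0
L_pos = L[L > 1][::-1]                    # only L > 1 gives a nontrivial upper bound
lower = rho - rho / (1.0 - L_neg)         # Lambda^(L)_{-k}
upper = rho - rho / (1.0 - L_pos)         # Lambda^(L)_{l}

# Same values from (2.17): (H1 - rho H2) y = Lambda (H2 - rho H3) y  (indefinite pencil)
direct = np.sort(np.linalg.eigvals(np.linalg.solve(H2 - rho * H3, H1 - rho * H2)).real)
mapped = np.sort(rho - rho / (1.0 - L))

below, above = lam[lam < rho][::-1], lam[lam > rho]
print("lower bounds:", np.round(lower, 4))
print("upper bounds:", np.round(upper, 4))
```

Solving (2.16) and mapping back avoids handing an indefinite right-hand matrix to a symmetric-definite eigensolver; the direct solve of (2.17) is included only as a cross-check.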

The left- and right-definite Lehmann bounds, $\Lambda^{(L)}$ and $\Lambda^{(R)}$, that are below the parameter $\rho$ are monotone increasing with respect to $\rho$. This is easy to show for $\rho$ satisfying (2.1); however, as $\rho$ is increased further, r changes and the labeling of $\Lambda^{(L)}$ and $\Lambda^{(R)}$ shifts. This more complicated circumstance is discussed in [^], where a proof of monotonicity in the general case may be found.



Notice that (2.17) could be obtained formally from the right-definite method expressed in (2.9) by direct substitution of the M-inner product for the K-inner product:

(2.18)    $P^{\dagger}[(M^{-1}K) - \rho I]P\,y = \Lambda^{(L)}\,P^{\dagger}[I - \rho(M^{-1}K)^{-1}]P\,y.$

Such a substitution also converts the harmonic Ritz problem into a Ritz 
problem and the Ritz problem, then into a dual harmonic Ritz problem. 
This provides some impetus to call the "left-definite Lehmann" method the 
"harmonic Lehmann" method, but Lehmann himself referred to this method 
as "left-definite" and besides the correspondences are a bit backward since 






CHRISTOPHER BEATTIE 



(right-definite) Lehmann is to Ritz as "dual harmonic Ritz" is to "harmonic Lehmann."



3. An Alternative Formulation 

Kahan developed a formulation of Lehmann's right-definite method that is particularly well-suited to many computational settings for matrix eigenvalue problems (cf. [16], Chap. 10). We review the development here and extend it to Lehmann's left-definite method. For a given m-dimensional subspace V, suppose the columns of $Q_1$ provide an M-orthonormal basis for V: span($Q_1$) = V and $Q_1^{\dagger}Q_1 = Q_1^*MQ_1 = I$. Define H from the "residual orthogonality" condition

    $(M^{-1}KQ_1 - Q_1H)^*MQ_1 = 0$

so that $H = Q_1^*KQ_1$ and observe (say, from the Gram-Schmidt process) that there is an upper triangular matrix C and a matrix $Q_2$ with M-orthonormal columns so that

    $Q_2C = M^{-1}KQ_1 - Q_1H.$

Pick $Q_3$ to fill out an M-orthonormal basis for $\mathbb{R}^n$ in conjunction with $Q_1$ and $Q_2$. Then with $Q = [Q_1\ Q_2\ Q_3]$, we have $Q^*MQ = I$ and

    $M^{-1}KQ = Q \begin{bmatrix} H & C^* & 0 \\ C & V_{11} & V_{21}^* \\ 0 & V_{21} & V_{22} \end{bmatrix}$,   where H is m × m and $V_{11}$ is k × k.


While this shows how H and C might be constructed (essentially one step of a block Lanczos process), there may be other situations of interest when H and C are known a priori. In any case, we assume that the bottom right block 2 × 2 submatrix, V, is either unknown or at least unpleasant to deal with. With additional unitary massage, rank(C) = k could be assumed (possibly resulting in a smaller $V_{11}$), though it isn't necessary in what follows. The situation rank(C) = k ≪ m ≪ n is common. What follows is a deus ex machina development of Kahan's formulation of Lehmann bounds that offers brevity but little of the insight and revelation that one may find in the excellent discussion of ([16], Chapter 10).

Apply the right-definite Lehmann bounds from (2.5) using $P = Q_1$. Then, $(K - \rho M)P = Q_1(H - \rho I) + Q_2C$ and the right-definite Lehmann problem (2.6) appears as

(3.1)    $(H - \rho I)\,y = R\,[(H - \rho I)^2 + C^*C]\,y.$

The associated right-definite bound is $\Lambda^{(R)} = \rho + 1/R$ and we may manipulate (3.1) to get an equivalent condition on $\Lambda^{(R)}$:

(3.2)    $0 = [(H - \rho I)(H - \Lambda^{(R)}I) + C^*C]\,y.$
One may recognize that the coefficient matrix of (3.2) is (up to sign) a Schur complement of the (m + k) × (m + k) matrix

    $Y(\Lambda^{(R)}) = \begin{bmatrix} -(H - \rho I)(H - \Lambda^{(R)}I) & C^* \\ C & I \end{bmatrix}.$

Hence, (3.2) has a non-trivial solution if and only if $Y(\Lambda^{(R)})$ is singular. Suppose that neither $\rho$ nor $\Lambda^{(R)}$ are eigenvalues of H for the time being and define

    $L_1 = \begin{bmatrix} I & 0 \\ C(H - \rho I)^{-1}(H - \Lambda^{(R)}I)^{-1} & I \end{bmatrix}$,   $L_2 = \begin{bmatrix} I & 0 \\ C(H - \Lambda^{(R)}I)^{-1} & I \end{bmatrix}$,   and   $D(\Lambda^{(R)}) = \begin{bmatrix} -(H - \rho I)^{-1} & 0 \\ 0 & (\rho - \Lambda^{(R)})I \end{bmatrix}.$

Then

    $L_2\,D(\Lambda^{(R)})\,L_1\,Y(\Lambda^{(R)})\,L_1^*\,L_2^* = \begin{bmatrix} H & C^* \\ C & \rho I + C(H - \rho I)^{-1}C^* \end{bmatrix} - \Lambda^{(R)}I.$

Thus $\Lambda^{(R)}$ is an eigenvalue of the (m + k) × (m + k) matrix

(3.3)    $\begin{bmatrix} H & C^* \\ C & \rho I + C(H - \rho I)^{-1}C^* \end{bmatrix}$

if and only if either $D(\Lambda^{(R)})$ is singular or $Y(\Lambda^{(R)})$ is singular, which is to say, if and only if either $\Lambda^{(R)}$ is a right-definite Lehmann bound satisfying (3.2) or $\Lambda^{(R)} = \rho$ (which will occur with multiplicity k). A limiting argument can be mustered to handle the exceptional cases where either $\rho$ or $\Lambda^{(R)}$ are eigenvalues of H. In situations where either the smaller eigenvalues of (1.1) are of interest or $\|C\|$ is much smaller than $\|H\|$, finding the eigenvalues of (3.3) is likely to yield substantially more accurate results for $\Lambda^{(R)}$ than a direct attack on (3.1). A similar formulation for left-definite Lehmann problems will be described below.
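The equivalence between (3.1) and the auxiliary matrix (3.3) can be checked numerically. The sketch below takes M = I and a hypothetical diagonal K, builds H and C from a QR factorization of the residual (so k = m), and compares the two sets of values; the data, the shift, and the helper name are assumptions for illustration.

```python
import numpy as np

def gen_eigvals(A, B):
    """Ascending eigenvalues of A y = t B y (A symmetric, B symmetric positive definite)."""
    Lc = np.linalg.cholesky(B)
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T
    return np.linalg.eigvalsh((C + C.T) / 2)

rng = np.random.default_rng(5)
lam = np.arange(1.0, 100.0, 2.0)
K = np.diag(lam)                              # M = I in this sketch
n, m, rho = len(lam), 4, 10.0

# Orthonormal basis near the eigenvectors for 1, 3, 5, 7 (hypothetical trial subspace).
Q1, _ = np.linalg.qr(np.eye(n)[:, :m] + 0.01 * rng.standard_normal((n, m)))
H = Q1.T @ K @ Q1
Q2, C = np.linalg.qr(K @ Q1 - Q1 @ H)         # Q2 C = K Q1 - Q1 H

A = H - rho * np.eye(m)
R = gen_eigvals(A, A @ A + C.T @ C)           # problem (3.1)
lehmann = rho + 1.0 / R                       # right-definite Lehmann values

S = np.block([[H, C.T],
              [C, rho * np.eye(m) + C @ np.linalg.solve(A, C.T)]])   # matrix (3.3)
eigs_S = np.linalg.eigvalsh((S + S.T) / 2)
expected = np.sort(np.concatenate([lehmann, rho * np.ones(m)]))

print(np.round(eigs_S, 6))
print(np.round(expected, 6))
```

The spectrum of (3.3) consists of the Lehmann values together with $\rho$ repeated k times, as stated above.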

Consider the application of the left-definite problem (2.16) with $P = Q_1$. Note that $KQ_1 = Q_1H + Q_2C$ implies that

    $K^{-1}Q_1 = Q_1H^{-1} - K^{-1}Q_2CH^{-1}$

so then

(3.4)    $Q_1^*K^{-1}Q_1 = H^{-1} + H^{-1}C^*WCH^{-1}$

where $W = Q_2^*K^{-1}Q_2$ has been introduced. (2.16) becomes

(3.5)    $(H - \rho I)\,y = L\,[(H - \rho I) - \rho(I - \rho(H^{-1} + H^{-1}C^*WCH^{-1}))]\,y.$





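The associated left-definite bound is $\Lambda^{(L)} = -\rho L/(1 - L)$ and we may manipulate (3.5) to get an equivalent condition on $\Lambda^{(L)}$:

(3.6)    $0 = [(H - \rho I)(H - \Lambda^{(L)}I)H + \rho\Lambda^{(L)}C^*WC]\,y.$

(3.6) has a non-trivial solution if and only if the (m + k) × (m + k) matrix

    $Y(\Lambda^{(L)}) = \begin{bmatrix} -(H - \rho I)(H - \Lambda^{(L)}I)H & \Lambda^{(L)}C^* \\ \rho C & W^{-1} \end{bmatrix}$

is singular. Suppose that neither $\rho$ nor $\Lambda^{(L)}$ are eigenvalues of H and define

    $F = (H - \rho I)^{-1}(H - \Lambda^{(L)}I)^{-1}H^{-1},$

    $L_1 = \begin{bmatrix} I & 0 \\ \rho CF & I \end{bmatrix}$,   $U_1 = \begin{bmatrix} I & \Lambda^{(L)}FC^* \\ 0 & I \end{bmatrix},$

    $L_2 = \begin{bmatrix} I & 0 \\ C(H - \Lambda^{(L)}I)^{-1} & I \end{bmatrix}$,   and   $D(\Lambda^{(L)}) = \begin{bmatrix} -(H - \rho I)^{-1}H^{-1} & 0 \\ 0 & \frac{\rho - \Lambda^{(L)}}{\rho}I \end{bmatrix}.$

Then

(3.7)    $L_2\,D(\Lambda^{(L)})\,L_1\,Y(\Lambda^{(L)})\,U_1\,L_2^* = \begin{bmatrix} H - \Lambda^{(L)}I & C^* \\ C & N_1 - \frac{\Lambda^{(L)}}{\rho}(N_1 - N_2) \end{bmatrix}$

where $N_1 = W^{-1} + CH^{-1}C^*$ and $N_2 = C(H - \rho I)^{-1}C^*$. Thus $\Lambda^{(L)}$ is an eigenvalue of an auxiliary (m + k) × (m + k) matrix pencil, not unlike the right-definite case. This matrix pencil will be definite when $N_1 - N_2$ is positive-definite, which in turn can be guaranteed when the (r − 1)st Ritz value is a sufficiently accurate approximation to $\lambda_{r-1}$: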

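Theorem 3.1. Suppose $\rho$ is not an eigenvalue of (1.1). Each interval $[\Lambda^{(L)}_{-i}, \rho)$ and $(\rho, \Lambda^{(L)}_{j}]$ contains respectively at least i and j eigenvalues of (1.1), where

    $0 < \cdots \le \Lambda^{(L)}_{-2} \le \Lambda^{(L)}_{-1} < \rho < \Lambda^{(L)}_{1} \le \Lambda^{(L)}_{2} \le \cdots$

are the positive eigenvalues of the (m + k) × (m + k) matrix pencil

(3.8)    $\begin{bmatrix} H & C^* \\ C & N_1 \end{bmatrix} - \Lambda^{(L)} \begin{bmatrix} I & 0 \\ 0 & M_1 \end{bmatrix}$

where

    $M_1 = \frac{1}{\rho}(N_1 - N_2),$
    $N_1 = W^{-1} + CH^{-1}C^*,$
    $N_2 = C(H - \rho I)^{-1}C^*.$

$\rho$ is an eigenvalue of (3.8) with multiplicity k. If the Ritz value $\Lambda_{r-1} < \rho$, then $M_1$ is positive definite and (3.8) is a Hermitian definite pencil.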
Proof: The first assertion follows immediately from (3.7), since then $\Lambda^{(L)}$ is an eigenvalue of (3.8) if and only if either $D(\Lambda^{(L)})$ is singular or $Y(\Lambda^{(L)})$ is singular. As before, a limiting argument handles the exceptional cases where either $\rho$ or $\Lambda^{(L)}$ are eigenvalues of H.

For the second statement, note that $\Lambda_{r-1} < \rho$ implies from the way that r was chosen in (2.1) that $H - \rho I$ has precisely r − 1 negative eigenvalues. Note then that $N_1 - N_2$ is positive-definite if and only if the matrix

(3.9)    $\begin{bmatrix} \frac{1}{\rho}(H - \rho I)H & 0 \\ 0 & N_1 - N_2 \end{bmatrix}$

has precisely r − 1 negative eigenvalues. Define

    $L_1 = \begin{bmatrix} I & 0 \\ \rho CH^{-1}(H - \rho I)^{-1} & I \end{bmatrix}$,   $L_2 = \begin{bmatrix} I & -C^*W \\ 0 & I \end{bmatrix}$,   and   $D = \begin{bmatrix} \rho H^{-1} & 0 \\ 0 & I \end{bmatrix}$

and calculate

(3.10)    $D\,L_2\,L_1 \begin{bmatrix} \frac{1}{\rho}(H - \rho I)H & 0 \\ 0 & N_1 - N_2 \end{bmatrix} L_1^*\,L_2^*\,D^* = \begin{bmatrix} \rho(I - \rho P^*K^{-1}P) & 0 \\ 0 & W^{-1} \end{bmatrix}.$

Suppose (3.9) had more than r − 1 negative eigenvalues. Then (3.10) has more than r − 1 negative eigenvalues and therefore $I - \rho P^*K^{-1}P$ has more than r − 1 negative eigenvalues. Equivalently, this means that $P^*K^{-1}P$ has r or more eigenvalues above $1/\rho$. Since the eigenvalues of $P^*K^{-1}P$ provide inner bounds to the outer eigenvalues of $K^{-1}$, this implies in turn that $K^{-1}$ must have r or more eigenvalues above $1/\rho$. But this contradicts the choice of $\rho$ made in (2.1).  ■

The calculation of $W = Q_2^*K^{-1}Q_2$ involves the solution of k linear systems each of the form Kx = b. If these systems are solved inexactly (one rarely has other options), reasonable concerns arise about the integrity of the resulting bounds. Rigorous inclusion intervals can be maintained if the approximate calculation of W can be made to have the effect of replacing W with a matrix $\bar{W} \ge W$ (i.e., so that $\bar{W} - W$ is positive semidefinite). To see this, observe that with the replacement of $\bar{W}$ for W, (3.5) becomes

(3.11)    $(H - \rho I)\,y = \bar{L}\,[(H - \rho I) - \rho(I - \rho(H^{-1} + H^{-1}C^*\bar{W}CH^{-1}))]\,y.$

The right-hand side of (3.5) has been replaced with a larger right-hand side in (3.11). The left-hand side remains the same, so (3.11) and (3.5) will have the same numbers of positive ($\pi$) and negative ($\nu$) eigenvalues. The min-max characterization then may be used to show that

    $L_k \le \bar{L}_k < 0$   for $k = 1, \ldots, \nu$
    $0 < \bar{L}_{-\ell} \le L_{-\ell}$   for $\ell = 1, \ldots, \pi.$





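The inequalities of (2.13) remain valid if $\bar{L}_k$ replaces $L_k$ and $\bar{L}_{-\ell}$ replaces $L_{-\ell}$. Likewise if we define $\bar{\Lambda}^{(L)}_{\pm i} = -\rho\bar{L}_{\mp i}/(1 - \bar{L}_{\mp i})$, the usual labeling is retained

    $\cdots \le \bar{\Lambda}^{(L)}_{-3} \le \bar{\Lambda}^{(L)}_{-2} \le \bar{\Lambda}^{(L)}_{-1} < \rho < \bar{\Lambda}^{(L)}_{1} \le \bar{\Lambda}^{(L)}_{2} \le \bar{\Lambda}^{(L)}_{3} \le \cdots$

and $\bar{\Lambda}^{(L)}_{-k} \le \Lambda^{(L)}_{-k}$ for each $k = 1, \ldots, \nu$. The situation regarding the positively indexed $\bar{\Lambda}^{(L)}$ that yield bounds above $\rho$ is slightly more complicated since it may occur that $\bar{L}_{-\ell} < 1 < L_{-\ell}$, which would then imply that $\bar{\Lambda}^{(L)}_{\ell} < 0$. In effect, $\bar{\Lambda}^{(L)}_{\ell}$ has "wrapped around" the point at infinity, yielding only trivial bounds for $\lambda_{r+\ell-1}$. Nontrivial bounds are retained whenever $\bar{\Lambda}^{(L)}_{\ell} > 0$, however.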

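Now, much the same development that yielded Theorem 3.1 may be followed with $\bar{W}$ replacing W. This is summarized as

Theorem 3.2. Suppose $\rho$ is not an eigenvalue of (1.1). Each interval $[\bar{\Lambda}^{(L)}_{-i}, \rho)$ and $(\rho, \bar{\Lambda}^{(L)}_{j}]$ contains respectively at least i and j eigenvalues of (1.1), where

    $0 < \cdots \le \bar{\Lambda}^{(L)}_{-2} \le \bar{\Lambda}^{(L)}_{-1} < \rho < \bar{\Lambda}^{(L)}_{1} \le \bar{\Lambda}^{(L)}_{2} \le \cdots$

are the positive eigenvalues of the (m + k) × (m + k) matrix pencil

(3.12)    $\begin{bmatrix} H & C^* \\ C & \bar{N}_1 \end{bmatrix} - \bar{\Lambda}^{(L)} \begin{bmatrix} I & 0 \\ 0 & \bar{M}_1 \end{bmatrix}$

where

    $\bar{M}_1 = \frac{1}{\rho}(\bar{N}_1 - N_2),$
    $\bar{N}_1 = \bar{W}^{-1} + CH^{-1}C^*,$
    $N_2 = C(H - \rho I)^{-1}C^*,$

and $\bar{W}$ is any positive-definite matrix satisfying $\bar{W} \ge W = Q_2^*K^{-1}Q_2$. $\rho$ is an eigenvalue of (3.12) with multiplicity k.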

Goerisch discovered this approach and developed a very flexible framework for applying this critical approximation step for the original left-definite Lehmann formulation (2.16) in a PDE setting. He called it the {X, b, T} method (referring to an auxiliary vector space X, an auxiliary bilinear form b, and an auxiliary linear operator T that he introduces) but most others refer to this approach simply as the Lehmann-Goerisch method. To give a simple example, suppose a lower bound to K is known: $\kappa\|x\|^2 \le x^*Kx$, and suppose we have obtained an approximate solution $Z_2$ to the matrix equation $KZ = Q_2$. Let $R = Q_2 - KZ_2$ be the associated residual matrix. Then one may verify that

    $W = Q_2^*K^{-1}Q_2 = R^*K^{-1}R + Z_2^*R + Q_2^*Z_2$
    $\le \frac{1}{\kappa}R^*R + Z_2^*R + Q_2^*Z_2 =: \bar{W}.$

Note that $\bar{W}$ contains the nominal estimate of W, "$Q_2^*K^{-1}Q_2$" ≈ $Q_2^*Z_2$, together with correction terms that ensure $\bar{W} \ge W$ and that can be made small by solving $KZ = Q_2$ more accurately.
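The residual-based majorant can be illustrated with a small NumPy sketch. Here the inexact solve is imitated by perturbing the exact solution, and $\kappa$ is taken slightly below the smallest eigenvalue of K; both choices, and the random test data, are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 12, 3
G = rng.standard_normal((n, n))
K = G @ G.T + n * np.eye(n)                          # symmetric positive definite test matrix
kappa = 0.9 * np.linalg.eigvalsh(K)[0]               # a lower bound: kappa ||x||^2 <= x^T K x
Q2, _ = np.linalg.qr(rng.standard_normal((n, k)))

# Stand-in for an inexact solver of K Z = Q2: exact solution plus a perturbation.
Z2 = np.linalg.solve(K, Q2) + 1e-2 * rng.standard_normal((n, k))
R = Q2 - K @ Z2                                      # residual matrix

W = Q2.T @ np.linalg.solve(K, Q2)                    # exact W (for comparison only)
W_identity = R.T @ np.linalg.solve(K, R) + Z2.T @ R + Q2.T @ Z2
W_bar = R.T @ R / kappa + Z2.T @ R + Q2.T @ Z2
W_bar = (W_bar + W_bar.T) / 2                        # remove rounding-level asymmetry

print("smallest eigenvalue of W_bar - W:", np.linalg.eigvalsh(W_bar - W).min())
```

Only $\kappa$, $Z_2$, and the residual enter $\bar{W}$, so no exact solve with K is needed in practice; the exact W is formed above solely to check the inequality.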



4. A Left-Right Comparison 



For the general eigenvalue problem (1.1), application of either right- or left-definite Lehmann bounds involves solving linear systems having either M (for right-definite problems) or K (for left-definite problems) as a coefficient matrix. If one system is very much simpler than the other (e.g., if M = I) one may feel compelled to choose the simpler path. But is there a difference in accuracy? Goerisch and coworkers in Braunschweig and Clausthal (see for example, [^] and [^]) have observed that for many applications in PDE settings, left-definite Lehmann bounds often were superior to right-definite bounds, even if an extra level of approximation is included as described in Theorem 3.2. Along similar lines, Knyazev [10] has produced error estimates for Lehmann methods that suggest left-definite bounds might be better than right-definite bounds asymptotically.

We explore this issue here. Define

    $J_0 = H_0 - \rho H_1$,  $J_1 = H_1 - \rho H_2$,  and  $J_2 = H_2 - \rho H_3$.

The matrix pencils associated with (2.6) and (2.16) may be written as

(4.1)    $J_1 - R\,(J_0 - \rho J_1)$

and

(4.2)    $J_1 - L\,(J_1 - \rho J_2)$

for right-definite and left-definite problems, respectively.

The following lemma and theorem incorporate some unpublished results of Goerisch^1.



Lemma 4.1. Let $G = \begin{bmatrix} J_0 & J_1 \\ J_1 & J_2 \end{bmatrix} \in \mathbb{R}^{2m\times 2m}$. G has no more than r − 1 negative eigenvalues.

Proof: Suppose that G has r or more negative eigenvalues. Then there is an r-dimensional subspace $\mathcal{Z}$ of $\mathbb{R}^{2m}$ such that $z^*Gz < 0$ for all $z \in \mathcal{Z}$ with $z \ne 0$. Define the linear mapping $T : \mathcal{Z} \to \mathbb{R}^n$ by

    $T(z) = \sum_{i=1}^{m} z_i\,Kp_i + \sum_{i=1}^{m} z_{i+m}\,Mp_i.$

Elementary manipulations verify that for $z \in \mathcal{Z}$ with $z \ne 0$,

(4.3)    $z^*Gz = T(z)^*M^{-1}T(z) - \rho\,T(z)^*K^{-1}T(z) < 0.$

In particular, this means that T(z) = 0 implies that z = 0, so null(T) = {0} and rank(T) = dim $\mathcal{Z}$ = r.

Since K is positive-definite, $u^*K^{-1}u > 0$ for all nonzero $u \in \mathbb{R}^n$, so (4.3) implies $u^*M^{-1}u / u^*K^{-1}u < \rho$ for all $u \in \mathrm{Ran}(T)$ with $u \ne 0$.

Now $\lambda$ is an eigenvalue of (1.1) if and only if it is also an eigenvalue of $M^{-1}v = \lambda K^{-1}v$, so by the min-max principle

    $\lambda_r = \min_{\dim\mathcal{V}=r}\ \max_{u\in\mathcal{V}} \frac{u^*M^{-1}u}{u^*K^{-1}u} \le \max_{u\in\mathrm{Ran}(T)} \frac{u^*M^{-1}u}{u^*K^{-1}u} < \rho$

which contradicts $\lambda_{r-1} < \rho < \lambda_r$. Thus, dim $\mathcal{Z}$ < r.  ■

^1 Friedrich Goerisch died suddenly in 1995 after a brief illness. The loss of his passion and insight is still deeply felt among his colleagues and friends.



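Theorem 4.2. If the harmonic Ritz value $\widetilde{\Lambda}_{r-1}$ from (1.4) satisfies $\widetilde{\Lambda}_{r-1} < \rho$ then left-definite Lehmann bounds will be uniformly better than right-definite Lehmann bounds:

(4.4)    $\Lambda^{(R)}_{-k} \le \Lambda^{(L)}_{-k} \le \lambda_{r-k}$   for $k = 1, \ldots, r - 1$
(4.5)    $\lambda_{r+\ell-1} \le \Lambda^{(L)}_{\ell} \le \Lambda^{(R)}_{\ell}$   for $\ell = 1, \ldots, m - r + 1.$

Proof: To show that (4.4) and (4.5) are true, it is sufficient to show that $L_k \le 1 + \rho R_k$ for $k = 1, 2, \ldots, r - 1$ and that $1 + \rho R_{-\ell} \le L_{-\ell}$ for $\ell = 1, 2, \ldots, m - r + 1$. From (4.1), one finds that $1 + \rho R_k$ and $1 + \rho R_{-\ell}$ are eigenvalues of

(4.6)    $J_0 - (1 + \rho R)(J_0 - \rho J_1).$

Since $\Lambda_{r-1} \le \widetilde{\Lambda}_{r-1} < \rho$, both $J_0$ and $J_1$ have r − 1 negative eigenvalues. This implies that both (4.1) and (4.2) have r − 1 negative eigenvalues. Premultiplication of (4.6) by $J_1J_0^{-1}$ yields an equivalent matrix pencil:

    $J_1 - (1 + \rho R)(J_1 - \rho J_1J_0^{-1}J_1).$

Consider

    $G = \begin{bmatrix} J_0 & J_1 \\ J_1 & J_2 \end{bmatrix} = \begin{bmatrix} I & 0 \\ J_1J_0^{-1} & I \end{bmatrix} \begin{bmatrix} J_0 & 0 \\ 0 & J_2 - J_1J_0^{-1}J_1 \end{bmatrix} \begin{bmatrix} I & J_0^{-1}J_1 \\ 0 & I \end{bmatrix}.$

By the lemma and the Sylvester law of inertia, the block diagonal matrix with blocks $J_0$ and $J_2 - J_1J_0^{-1}J_1$ can have no more than r − 1 negative eigenvalues. Since $J_0$ has exactly r − 1 negative eigenvalues by hypothesis, $J_2 - J_1J_0^{-1}J_1$ must be positive semi-definite and

    $0 < x^*(J_1 - \rho J_2)x \le x^*(J_1 - \rho J_1J_0^{-1}J_1)x$

for all nontrivial x. Hence, for $k = 1, 2, \ldots, r - 1$,

    $1 + \rho R_k = \min_{\dim\mathcal{S}=k}\ \max_{x\in\mathcal{S}} \frac{x^*J_1x}{x^*(J_1 - \rho J_1J_0^{-1}J_1)x} \ \ge\ \min_{\dim\mathcal{S}=k}\ \max_{x\in\mathcal{S}} \frac{x^*J_1x}{x^*(J_1 - \rho J_2)x} = L_k$

and for $\ell = 1, 2, \ldots, m - r + 1$,

    $-(1 + \rho R_{-\ell}) = \min_{\dim\mathcal{S}=\ell}\ \max_{x\in\mathcal{S}} \frac{-x^*J_1x}{x^*(J_1 - \rho J_1J_0^{-1}J_1)x} \ \ge\ \min_{\dim\mathcal{S}=\ell}\ \max_{x\in\mathcal{S}} \frac{-x^*J_1x}{x^*(J_1 - \rho J_2)x} = -L_{-\ell}.$

Since there will be subspaces of dimension up to r − 1 for which $x^*J_1x < 0$ and subspaces of dimension up to m − r + 1 for which $x^*J_1x > 0$, we may restrict ourselves to x for which the numerators in the above expressions are strictly negative with no loss of generality.  ■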
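The ordering asserted in Theorem 4.2 can be observed on a small example. The sketch below uses a hypothetical diagonal K with M = I, a trial subspace close to the lowest eigenvectors, and the shift $\rho = 4$ (so r − 1 = 2); it first checks the hypothesis on the harmonic Ritz values and then compares the two families of bounds. All data and helper names are assumptions for illustration.

```python
import numpy as np

def gen_eigvals(A, B):
    """Ascending eigenvalues of A y = t B y (A symmetric, B symmetric positive definite)."""
    Lc = np.linalg.cholesky(B)
    C = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T
    return np.linalg.eigvalsh((C + C.T) / 2)

rng = np.random.default_rng(7)
lam = np.arange(1.0, 100.0, 2.0)                 # eigenvalues 1, 3, ..., 99 (M = I)
K, Kinv = np.diag(lam), np.diag(1.0 / lam)
n, m, rho = len(lam), 6, 4.0                     # lambda_2 = 3 < rho < lambda_3 = 5
P = np.eye(n)[:, :m] + 1e-3 * rng.standard_normal((n, m))

H0, H1, H2, H3 = P.T @ K @ K @ P, P.T @ K @ P, P.T @ P, P.T @ Kinv @ P
J0, J1, J2 = H0 - rho * H1, H1 - rho * H2, H2 - rho * H3

harmonic = gen_eigvals(H0, H1)                   # harmonic Ritz values, problem (1.4)
hypothesis = bool(np.sum(harmonic < rho) == np.sum(lam < rho))

R = gen_eigvals(J1, J0 - rho * J1)               # right-definite pencil (4.1)
L = gen_eigvals(J1, J1 - rho * J2)               # left-definite pencil (4.2)
right_lower = rho + 1.0 / R[R < 0]
left_lower = rho - rho / (1.0 - L[L < 0])
right_upper = rho + 1.0 / R[R > 0][::-1]
left_upper = rho - rho / (1.0 - L[L > 0][::-1])

print("hypothesis holds:", hypothesis)
print("lower bounds  right:", right_lower, " left:", left_lower)
print("upper bounds  left :", left_upper, " right:", right_upper)
```

When the hypothesis holds, each left-definite lower bound should sit at or above the corresponding right-definite one, and each left-definite upper bound at or below it.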



5. A Ritz-Lehmann Comparison 

One may hope that the role spectral mapping played in the derivation of 
both left- and right-definite variants of Lehmann's method might lead to signif- 
icant improvements beyond the straightforward application of the Rayleigh- 
Ritz method. Indeed, spectral mapping has been used for some time with 
Lanczos methods (e.g., Q) with sometimes spectacular effect and so encour- 
aged, some have considered the use of right-definite Lehmann bounds using 
Krylov subspaces generated in the course of an ordinary Lanczos process (e.g., 
[1l5| and By and large, results along these lines have been disappointing 

when compared with what "shift-and-invert" methods offer (albeit at a much 
higher price). One may instead seek to compare the expected outcomes of 
Lehmann methods with those of Rayleigh-Ritz methods. Observe that each 
method makes optimal use of the information required in the sense that no 
better bounds are possible with the infomation used, so in a certain manner 
of speaking we are really comparing the utility of various types of information 
in extracting eigenvalue information. 

Zimmerman proved that the error in left-definite Lehmann bounds is
no worse than proportional to the error in Ritz bounds and may be smaller.
Thus, left-definite Lehmann bounds carry the potential of greater accuracy
than Ritz bounds. We probably shouldn't expect them to be much better,
though. In [10], Knyazev states that eigenvector approximations provided by
either the right- or left-definite variants of Lehmann's method will
asymptotically approach the corresponding Ritz vectors as they close upon the
true eigenvectors. Thus, Lehmann methods appear to recover invariant subspace
information with about the same efficiency as Rayleigh-Ritz methods.

It is important to note that Lehmann methods provide eigenvalue bounds
that often are difficult to obtain in other ways. For example, Behnke
combined right-definite Lehmann methods with interval techniques in order
to deduce guaranteed bounds for matrix eigenvalue problems, and his approach
appears to be competitive with the best known interval algorithms for this
problem.
For the remainder of this section, we will consider the application of a
left-definite Lehmann method within a Lanczos process for resolving a
large-scale matrix eigenvalue problem. Since left-definite Lehmann methods are
known to be superior to right-definite Lehmann methods (at least to the extent
claimed in Section 4), one may seek to improve upon the results of Morgan by
using left-definite Lehmann-Goerisch bounds as formulated in Theorem 3.2.

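Specifically, let M = I in (1.1) and let T be a tridiagonal matrix that is
similar to K, so that K = QTQ* for some n x n unitary matrix Q. For any
index $1 \le i \le n$, let $T_i$ denote the ith principal submatrix of T:

\[
T_i = \begin{bmatrix}
\alpha_1 & \beta_1 & & \\
\beta_1 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_{i-1} \\
 & & \beta_{i-1} & \alpha_i
\end{bmatrix},
\]

and define V via a partitioning of T as

\[
T = \begin{bmatrix}
T_i & \beta_i e_i e_1^* \\
\beta_i e_1 e_i^* & V
\end{bmatrix}.
\]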

Let $Q_i$ denote a matrix containing the first i columns of Q:
$Q_i = [q_1, \ldots, q_i]$. The Lanczos algorithm builds up the matrices T and
Q one column at a time, starting with the vector $q_1$. Only information on
the action of K on selected vectors in $\mathbb{R}^n$ is used. Different
choices for $q_1$ produce distinct outcomes for T, if all goes well.
Extracting useful information when not all goes well is fundamental to modern
approaches; a discussion may be found in [16]. At the ith step, the basic
Lanczos recursion appears as

\[ K Q_i = Q_i T_i + \beta_i q_{i+1} e_i^*. \]
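The recursion can be checked numerically on a small problem. The following is a minimal sketch in Python with NumPy, not the implementation used for the experiments reported here. The function name `lanczos`, the random test matrix, and the use of full reorthogonalization (in place of the bare three-term recurrence, with which it coincides in exact arithmetic) are all assumptions made for illustration.

```python
import numpy as np

def lanczos(K, q1, steps):
    """Run `steps` Lanczos steps on symmetric K, starting from q1.

    Returns Q (n x (steps+1)) together with the diagonal entries
    alpha_1..alpha_steps and off-diagonal entries beta_1..beta_steps
    of the tridiagonal matrix.
    """
    n = K.shape[0]
    Q = np.zeros((n, steps + 1))
    alpha = np.zeros(steps)
    beta = np.zeros(steps)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(steps):
        w = K @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        # Full reorthogonalization (done twice) stands in for the three-term
        # recurrence; in exact arithmetic the two coincide.
        for _ in range(2):
            w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        beta[j] = np.linalg.norm(w)
        Q[:, j + 1] = w / beta[j]
    return Q, alpha, beta

# Small symmetric positive definite test matrix (hypothetical data).
rng = np.random.default_rng(0)
n, i = 60, 12
X = rng.standard_normal((n, n))
K = X @ X.T + n * np.eye(n)

Q, alpha, beta = lanczos(K, rng.standard_normal(n), i)
Q_i = Q[:, :i]
T_i = np.diag(alpha) + np.diag(beta[:i - 1], 1) + np.diag(beta[:i - 1], -1)

# Check K Q_i = Q_i T_i + beta_i q_{i+1} e_i^* and Q_i^* Q_i = I.
e_i = np.zeros(i)
e_i[-1] = 1.0
residual = K @ Q_i - Q_i @ T_i - beta[i - 1] * np.outer(Q[:, i], e_i)
print("recursion residual:", np.linalg.norm(residual))
print("orthonormality gap:", np.linalg.norm(Q_i.T @ Q_i - np.eye(i)))
```

Both printed quantities should be at rounding level relative to the norm of K. Without the reorthogonalization step, orthogonality of the Lanczos vectors typically degrades in floating point, which is the "not all goes well" situation mentioned above.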

In exact arithmetic, the first i steps yield a matrix $Q_i$ that satisfies
$Q_i^* Q_i = I$ and

\[ \mathrm{Ran}(Q_i) = \mathrm{span}\{q_1, Kq_1, \ldots, K^{i-1}q_1\} = \mathcal{K}_i(K, q_1), \]

a Krylov subspace of order i. The application of Theorem 3.2 is
straightforward:

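Theorem 5.1. Let M = I and suppose $\rho$ is not an eigenvalue of (1.2). Each
interval $[\Lambda^{(L)}_{-i}, \rho)$ and $(\rho, \Lambda^{(L)}_{+j}]$
contains respectively at least i and j eigenvalues of the matrix K, where

\[
0 < \cdots \le \Lambda^{(L)}_{-2} \le \Lambda^{(L)}_{-1} < \rho <
\Lambda^{(L)}_{+1} \le \Lambda^{(L)}_{+2} \le \cdots
\]

are the positive eigenvalues of the tridiagonal matrix pencil

(5.1) [pencil not recoverable from the source; it involves $\omega_{k+1}$ and $\delta_{k+1}(\rho)$]

where $\omega_{k+1}$ is any number that satisfies [condition not recoverable
from the source] and
$\delta_{k+1}(\rho) = e_k^* T_k^{-1}(T_k - \rho I)^{-1} e_k$.
Note that $\rho$ is a simple eigenvalue of [expression not recoverable from
the source].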



We apply this directly to the numerical example considered in [?] and in
Section 1. Figure 2 shows the convergence history both for Ritz bounds and
for left-definite Lehmann bounds, for the seventh through tenth eigenvalues of
the matrix. We also apply a shift-and-invert Lanczos method using a spectral
transformation with shift $\rho$. A few features are apparent. The first is
that the Lehmann bounds aren't nearly as good as the shift-and-invert bounds
to which they are closely related. Paige, Parlett, and van der Vorst [?]
observed this disappointing behaviour for right-definite Lehmann methods (in
their context, harmonic Ritz on a shifted matrix); the left-definite Lehmann
method does not fare much better. Knyazev's observations [10] relating
convergence of Lehmann eigenvectors to Ritz vectors suggest that spectral
information for interior matrix eigenvalues will not be picked up any more
rapidly with Lehmann methods than with Ritz methods. This is in stark contrast
with shift-and-invert strategies, which will produce approximate eigenvectors
that are rapidly drawn into invariant subspaces associated with eigenvalues
close to $\rho$.
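The right-definite case mentioned above (harmonic Ritz on a shifted matrix) is easy to experiment with. The following is a minimal sketch in Python with NumPy, under stated assumptions: it treats the right-definite variant only, not the left-definite pencil of Theorem 5.1. The helper names `krylov_basis` and `shifted_harmonic_ritz`, the diagonal test matrix, and the shift value are hypothetical choices for illustration. The values $\theta$ are obtained from the projected problem $Q^*(K-\rho I)Qy = \mu\, Q^*(K-\rho I)^2 Qy$ with $\theta = \rho + 1/\mu$.

```python
import numpy as np

def krylov_basis(K, q1, k):
    """Orthonormal basis of the Krylov subspace K_k(K, q1)."""
    Q = np.zeros((K.shape[0], k))
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(1, k):
        w = K @ Q[:, j - 1]
        for _ in range(2):  # full reorthogonalization, twice for safety
            w = w - Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    return Q

def shifted_harmonic_ritz(K, Q, rho):
    """Harmonic Ritz values of K for shift rho over the range of Q."""
    SQ = (K - rho * np.eye(K.shape[0])) @ Q
    A = SQ.T @ SQ                 # Q^* (K - rho I)^2 Q, positive definite
    B = Q.T @ SQ                  # Q^* (K - rho I) Q, generally indefinite
    B = 0.5 * (B + B.T)
    # Reduce B y = mu A y to a standard symmetric problem via Cholesky.
    L = np.linalg.cholesky(A)
    M = np.linalg.solve(L, np.linalg.solve(L, B).T)   # L^{-1} B L^{-T}
    mu = np.linalg.eigvalsh(0.5 * (M + M.T))
    return rho + 1.0 / mu

# Hypothetical test problem: eigenvalues 1, 2, ..., 100 and an interior shift.
eigs = np.arange(1.0, 101.0)
K = np.diag(eigs)
rho = 50.3
Q = krylov_basis(K, np.ones(100), 20)
theta = shifted_harmonic_ritz(K, Q, rho)

above = np.sort(theta[theta > rho])          # theta_{+1} <= theta_{+2} <= ...
below = np.sort(theta[theta < rho])[::-1]    # theta_{-1} >= theta_{-2} >= ...
tol = 1e-8
for j, t in enumerate(above, start=1):
    # (rho, theta_{+j}] should contain at least j eigenvalues of K
    assert np.sum((eigs > rho) & (eigs <= t + tol)) >= j
for i, t in enumerate(below, start=1):
    # [theta_{-i}, rho) should contain at least i eigenvalues of K
    assert np.sum((eigs < rho) & (eigs >= t - tol)) >= i
print("nearest bounds around rho:", below[0], above[0])
```

The assertions check the inclusion property only. How tightly the bounds nearest $\rho$ bracket the neighbouring eigenvalues depends on the subspace dimension, and in line with the discussion above one should not expect them to converge quickly for an interior shift.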

The second observation is that, nonetheless, the Lehmann bounds do appear
to approach the exact eigenvalues at a rate comparable to that of the
Ritz bounds, consistent with the results of Zimmerman discussed above.
Furthermore, one can see that the Lehmann bounds appear to pass through a
series of stagnation points en route to their limit, and the farther they lie
from $\rho$, the more abrupt the transition between stagnation points. These
stagnation points appear to be close to the exact matrix eigenvalues.

The following simple Bauer-Fike style perturbation result lends some insight
into this behaviour.



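Theorem 5.2. Let $\Lambda^{(L)}$ be any left-definite Lehmann bound and denote
with $\Lambda_j$ the Ritz values from (1.2). Then

\[
\min_j \left(\frac{|\Lambda_j - \rho|}{\rho}\right)
\left(\frac{|\Lambda_j - \Lambda^{(L)}|}{\Lambda^{(L)}}\right) \Lambda_j
\le \|W\| \, \|C\|^2.
\tag{5.2}
\]

Proof: If either $(H - \rho I)$ or $(H - \Lambda^{(L)} I)$ is singular then
(5.2) holds trivially. Suppose then that $(H - \rho I)$ and
$(H - \Lambda^{(L)} I)$ are nonsingular. Rearrange the expression (3.6) to get

\[
y = -\rho \Lambda^{(L)} (H - \rho I)^{-1} (H - \Lambda^{(L)} I)^{-1} H^{-1} C^* W C y.
\]

Take norms on each side and simplify:

\[
1 \le \rho \Lambda^{(L)} \, \|(H - \rho I)^{-1} (H - \Lambda^{(L)} I)^{-1} H^{-1}\| \, \|W\| \, \|C\|^2.
\tag{5.3}
\]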
[Figure 2: plot of the estimates of the 7th through 10th eigenvalues against
subspace dimension (15 to 45). Legend: * = Ritz values; (left-definite)
Lehmann values; shift-and-invert Lanczos (spectral shift).]

Figure 2. Convergence of Ritz and Lehmann bounds using
Krylov subspaces vs. shift-and-invert Lanczos with same starting
vector.



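Then notice that

\[
\|(H - \rho I)^{-1} (H - \Lambda^{(L)} I)^{-1} H^{-1}\|
= \max_j \left(\frac{1}{|\Lambda_j - \rho|}\right)
\left(\frac{1}{|\Lambda_j - \Lambda^{(L)}|}\right) \frac{1}{\Lambda_j}
= 1 \Big/ \min_j \left( |\Lambda_j - \rho| \, |\Lambda_j - \Lambda^{(L)}| \, \Lambda_j \right),
\]

which may be combined with (5.3) to get (5.2). ■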
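The norm identity used in the last step of the proof is easy to confirm numerically. The sketch below, in Python with NumPy, assumes the spectral norm and a symmetric positive definite H. The matrix, the value of $\rho$, and the stand-in for the Lehmann bound are hypothetical values chosen so that the answer can be read off by hand.

```python
import numpy as np

# Hypothetical symmetric positive definite H with prescribed eigenvalues,
# which play the role of the Ritz values Lambda_j.
ritz = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random orthogonal matrix
H = U @ np.diag(ritz) @ U.T

# Illustrative values for rho and the Lehmann bound, away from the spectrum.
rho, lehmann = 3.0, 5.0
I = np.eye(5)

# Left side: spectral norm of (H - rho I)^{-1} (H - Lambda I)^{-1} H^{-1}.
lhs = np.linalg.norm(
    np.linalg.inv(H - rho * I) @ np.linalg.inv(H - lehmann * I) @ np.linalg.inv(H),
    2,
)
# Right side: 1 / min_j |Lambda_j - rho| |Lambda_j - Lambda| Lambda_j.
rhs = 1.0 / np.min(np.abs(ritz - rho) * np.abs(ritz - lehmann) * ritz)

print(lhs, rhs)   # both equal 1/4 up to rounding
```

Here the minimum is attained at $\Lambda_j = 4$, where $|4 - 3| \cdot |4 - 5| \cdot 4 = 4$, so both sides equal 1/4. The identity holds because the three factors are functions of the same symmetric matrix H and therefore share its eigenvectors.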

Notice that the right-hand side of (5.2) has a magnitude related to the
size of the Ritz residual $KQ_1 - Q_1H$ and is independent of which Lehmann
bound $\Lambda^{(L)}$ is chosen. Suppose the right-hand side of (5.2) is
moderately small and choose a Lehmann bound $\Lambda^{(L)}$. If
$\Lambda^{(L)}$ is not close to $\rho$, then any Ritz value $\Lambda_j$ that is
close to $\Lambda^{(L)}$ will not be close to $\rho$ either. Thus any
$\Lambda^{(L)}$ chosen far from $\rho$ is constrained by (5.2) to be nearer to
at least one $\Lambda_j$ than it would be were $\Lambda^{(L)}$ chosen closer to
$\rho$. A qualitative interpretation that one might take from this is that
Lehmann bounds $\Lambda^{(L)}$ far from $\rho$ tend to occur in the
neighborhood of Ritz values $\Lambda_j$. Furthermore, Lehmann bounds
$\Lambda^{(L)}$ far from $\rho$ that are also situated toward the edges of the
spectrum will tend to aggregate in the neighborhood of exact eigenvalues,
since the attracting Ritz values themselves will be approximating extreme
eigenvalues fairly well.